Digitalisierung historischer Zeitungen aus dem Blickwinkel der automatisierten Text- und Strukturerkennung (OCR)

Internal identifier: 000581 (Main/Exploration); previous: 000580; next: 000582

Authors: Günter Mühlberger [Austria]

Source:

Zeitschrift für Bibliothekswesen und Bibliographie [ISSN 0044-2380]; 2011.

RBID: Pascal:11-0198412

French descriptors

Pascal (Inist)
- Technologie information communication, Reconnaissance optique caractère, Numérisation.
Wicri:
- topic : Numérisation.

English descriptors

KwdEn:
- Digitizing, Information communication technology, Optical character recognition.

Abstract

OCR recognition is a key technology which cannot be circumvented when systematically digitizing historical newspapers. Although often achieving a word accuracy of only 80% or less for newspapers of the 19th and early 20th century, these imperfect files nevertheless provide a basis for a number of interesting applications - from full-text searching to indexing by search engines and online correction by users. However, in comparison to traditional digitization projects, the use of OCR requires a fundamental change of thinking during the project planning, the design of the workflow, the implementation of quality control, and in the designing of long-term preservation and presentation of digitized material on the Internet.
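The abstract quotes a word accuracy figure without saying how it is measured. One common convention, assumed here and not taken from the article, is one minus the word-level edit distance divided by the number of reference words. The sketch below applies that convention to a made-up sentence pair; the function names and the sample text are illustrative only.

```python
def word_edit_distance(ref_words, ocr_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(ocr_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, ocr_word in enumerate(ocr_words, start=1):
            cost = 0 if ref_word == ocr_word else 1
            curr.append(min(prev[j] + 1,          # word missing from the OCR output
                            curr[j - 1] + 1,      # extra word in the OCR output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, ocr_output):
    """Assumed definition: 1 - (word edit distance / number of reference words)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference text is empty")
    return 1.0 - word_edit_distance(ref_words, ocr_output.split()) / len(ref_words)


# Made-up example: 2 of 10 words are misrecognised.
reference = "the quick brown fox jumps over the lazy dog today"
ocr_output = "the quiek brown fox jumps ovcr the lazy dog today"
print(round(word_accuracy(reference, ocr_output), 2))
```

Under this definition, two wrong words out of ten already bring the score down to the 80% level mentioned in the abstract, which is why such files are usable for search but not as clean transcriptions.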

Affiliations:

Austria

Links toward previous steps (curation, corpus...)

to stream PascalFrancis, to step Corpus: 000146
to stream PascalFrancis, to step Corpus: 000157
to stream PascalFrancis, to step Curation: 000627
to stream PascalFrancis, to step Checkpoint: 000135
to stream Main, to step Merge: 000587
to stream Main, to step Curation: 000581

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="GER" level="a">Digitalisierung historischer Zeitungen aus dem Blickwinkel der automatisierten Text- und Strukturerkennung (OCR)</title>
<author><name sortKey="Muhlberger, Gunter" sort="Muhlberger, Gunter" uniqKey="Muhlberger G" first="Günter" last="Mühlberger">Günter Mühlberger</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Universitäts- und Landesbibliothek Tirol, Abteilung für Digitalisierung und elektronische Archivierung, Innrain 52,</s1>
<s2>6020 Innsbruck</s2>
<s3>AUT</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Autriche</country>
<wicri:noRegion>6020 Innsbruck</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">11-0198412</idno>
<date when="2011">2011</date>
<idno type="stanalyst">PASCAL 11-0198412 INIST</idno>
<idno type="RBID">Pascal:11-0198412</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000146</idno>
<idno type="stanalyst">FRANCIS 11-0198412 INIST</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000157</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000627</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000135</idno>
<idno type="wicri:doubleKey">0044-2380:2011:Muhlberger G:digitalisierung:historischer:zeitungen</idno>
<idno type="wicri:Area/Main/Merge">000587</idno>
<idno type="wicri:Area/Main/Curation">000581</idno>
<idno type="wicri:Area/Main/Exploration">000581</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="GER" level="a">Digitalisierung historischer Zeitungen aus dem Blickwinkel der automatisierten Text- und Strukturerkennung (OCR)</title>
<author><name sortKey="Muhlberger, Gunter" sort="Muhlberger, Gunter" uniqKey="Muhlberger G" first="Günter" last="Mühlberger">Günter Mühlberger</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Universitäts- und Landesbibliothek Tirol, Abteilung für Digitalisierung und elektronische Archivierung, Innrain 52,</s1>
<s2>6020 Innsbruck</s2>
<s3>AUT</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Autriche</country>
<wicri:noRegion>6020 Innsbruck</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Zeitschrift für Bibliothekswesen und Bibliographie</title>
<title level="j" type="abbreviated">Z. Bibliothekswes. Bibliogr.</title>
<idno type="ISSN">0044-2380</idno>
<imprint><date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Zeitschrift für Bibliothekswesen und Bibliographie</title>
<title level="j" type="abbreviated">Z. Bibliothekswes. Bibliogr.</title>
<idno type="ISSN">0044-2380</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Digitizing</term>
<term>Information communication technology</term>
<term>Optical character recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Technologie information communication</term>
<term>Reconnaissance optique caractère</term>
<term>Numérisation</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Numérisation</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">OCR recognition is a key technology which cannot be circumvented when systematically digitizing historical newspapers. Although often achieving a word accuracy of only 80% or less for newspapers of the 19th and early 20th century, these imperfect files nevertheless provide a basis for a number of interesting applications - from full-text searching to indexing by search engines and online correction by users. However, in comparison to traditional digitization projects, the use of OCR requires a fundamental change of thinking during the project planning, the design of the workflow, the implementation of quality control, and in the designing of long-term preservation and presentation of digitized material on the Internet.</div>
</front>
</TEI>
<affiliations><list><country><li>Autriche</li>
</country>
</list>
<tree><country name="Autriche"><noRegion><name sortKey="Muhlberger, Gunter" sort="Muhlberger, Gunter" uniqKey="Muhlberger G" first="Günter" last="Mühlberger">Günter Mühlberger</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
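As an alternative to the Dilib tools, a record like the one above can be read with a general-purpose XML parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`. The record uses the `wicri:` and `inist:` prefixes without declaring them, so a strict parser rejects it as written; the sketch binds them to placeholder URIs, which are assumptions and not official namespace names. The sample is an abbreviated copy of the record, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the record above; only a few elements are kept.
RECORD = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="GER" level="a">Digitalisierung historischer Zeitungen aus dem Blickwinkel der automatisierten Text- und Strukturerkennung (OCR)</title>
<author><name sortKey="Muhlberger, Gunter" first="Günter" last="Mühlberger">Günter Mühlberger</name>
<affiliation wicri:level="1"><country>Autriche</country></affiliation>
</author></titleStmt>
<publicationStmt><idno type="RBID">Pascal:11-0198412</idno>
<idno type="wicri:Area/Main/Merge">000587</idno>
<idno type="wicri:Area/Main/Exploration">000581</idno></publicationStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Digitizing</term>
<term>Information communication technology</term>
<term>Optical character recognition</term>
</keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""

# Placeholder namespace URIs (assumptions): the record itself declares none.
PLACEHOLDER_NS = 'xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist"'


def parse_record(text):
    """Bind the undeclared prefixes on the root element, then parse."""
    patched = text.replace("<record>", "<record " + PLACEHOLDER_NS + ">", 1)
    return ET.fromstring(patched)


def summarize(root):
    """Collect the title, author, RBID and English keywords."""
    return {
        "title": root.find(".//titleStmt/title").text,
        "author": root.find(".//titleStmt/author/name").text,
        "rbid": root.find(".//idno[@type='RBID']").text,
        "keywords": [t.text for t in root.findall(".//keywords[@scheme='KwdEn']/term")],
    }


def area_links(root):
    """List (stream, step, key) for every idno typed wicri:Area/<stream>/<step>."""
    links = []
    for idno in root.iter("idno"):
        kind = idno.get("type", "")
        if kind.startswith("wicri:Area/"):
            _, stream, step = kind.split("/")
            links.append((stream, step, idno.text))
    return links


root = parse_record(RECORD)
print(summarize(root))
print(area_links(root))
```

The `area_links` helper recovers the same stream, step and key triples that the page lists under "Links toward previous steps".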

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000581 | SxmlIndent | more

Or, if EXPLOR_AREA is already set to the root of the OcrV1 area:

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000581 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:11-0198412
   |texte=   Digitalisierung historischer Zeitungen aus dem Blickwinkel der automatisierten Text- und Strukturerkennung (OCR)
}}
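When many records need such a link, the template call can be generated instead of typed by hand. The sketch below is only an illustration: the parameter names and their order are copied from the template above, while the function name `explor_link` and the column width are assumptions chosen to reproduce the layout shown.

```python
# Parameter names in the order used by the template call above.
FIELD_ORDER = ["wiki", "area", "flux", "étape", "type", "clé", "texte"]


def explor_link(values):
    """Render an {{Explor lien}} call from a dict of parameter values."""
    lines = ["{{Explor lien"]
    for name in FIELD_ORDER:
        # Pad "name=" to 9 characters so the values line up in one column.
        lines.append("   |" + (name + "=").ljust(9) + values[name])
    lines.append("}}")
    return "\n".join(lines)


print(explor_link({
    "wiki": "Ticri/CIDE",
    "area": "OcrV1",
    "flux": "Main",
    "étape": "Exploration",
    "type": "RBID",
    "clé": "Pascal:11-0198412",
    "texte": "Digitalisierung historischer Zeitungen aus dem Blickwinkel "
             "der automatisierten Text- und Strukturerkennung (OCR)",
}))
```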

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

OCR exploration server
Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
